Abstract 

Massive scleractinian corals of the genus Porites are important reef builders in the Indo-Pacific, and they are more resistant 
to thermal stress than other stony corals, such as the genus Acropora. Because coral health and survival largely depend on 
the interaction between a coral host and its symbionts, it is important to understand the molecular interactions of an entire 
"coral holobiont". We simultaneously sequenced transcriptomes of Porites australiensis and its symbionts using the Illumina 
HiSeq2000 platform. We obtained 14.3 Gbp of sequencing data and assembled it into 74,997 contigs (average: 1,263 bp, 
N50 size: 2,037 bp). We successfully distinguished contigs originating from the host (Porites) and the symbiont 
(Symbiodinium) by aligning nucleotide sequences with the decoded Acropora digitifera and Symbiodinium minutum 
genomes. In contrast to previous coral transcriptome studies, at least 35% of the sequences were found to have originated 
from the symbionts, indicating that it is possible to analyze both host and symbiont transcriptomes simultaneously. 
Conserved protein domain and KEGG analyses showed that the dataset contains broad gene repertoires of both Porites and 
Symbiodinium. Effective utilization of sequence reads revealed that the polymorphism rate in P. australiensis is 1.0% and 
identified the major symbiotic Symbiodinium as type C15. Analyses of amino acid biosynthetic pathways suggested that this 
Porites holobiont is probably able to synthesize most of the common amino acids and that Symbiodinium is potentially able 
to provide essential amino acids to its host. We believe this to be the first molecular evidence of complementarity in amino 
acid metabolism between coral hosts and their symbionts. We successfully assembled genes originating from both the host 
coral and the symbiotic Symbiodinium to create a snapshot of the coral holobiont transcriptome. This dataset will facilitate a 
deeper understanding of molecular mechanisms of coral symbioses and stress responses. 
Introduction 

Coral reefs are estimated to harbor roughly one-third of all 
described marine species, and their productivity supports approx- 
imately one-quarter of marine fisheries. The major architects of 
coral reefs, the scleractinian corals, are anthozoan cnidarians that 
form obligate endosymbioses with photosynthetic dinoflagellates of 
the genus Symbiodinium. The unicellular symbionts are harbored in 
the host coral's gastrodermal (endodermal) tissue in intracellular 
vacuoles known as symbiosomes [1], which are thought to 
originate from the plasma membrane of host cells during the 
initial acquisition of symbionts by a phagocytic process [2]. 
Although many of the details of the interaction between the host 
and Symbiodinium remain to be explored, this association enables 
the massive rates of calcification that distinguish reef-building 
corals from other anthozoans, such as sea anemones and 
zoanthids. 



Coral reefs face a range of environmental changes, including 
ocean acidification, seawater temperature increases, and declines 
in coral abundance. Extensive loss of reef habitats is one of the 
most pressing environmental issues of our time [3-5]. Recently, 
increasing instances of "coral bleaching" have been observed. In 
most cases, coral bleaching is a breakdown of the mutualism 
between the coral and the photosynthetic dinoflagellate, resulting 
from a stress response to environmental perturbation. The 
integrity of the coral holobiont (a complex symbiosis between 
the coral animal, its endosymbiotic zooxanthellae, and an 
associated community of microorganisms) is essential for 
maintaining coral health [6]. The molecular mechanisms underlying 
the collapse of symbiosis and the bleaching response are complex 
and are still little understood. Bleaching is currently viewed as a 
host response to a compromised symbiont that is analogous to the 
innate immune responses that occur in other host-microbe 
interactions. Expulsion or elimination of the symbiont from host 
tissues is thought to involve a variety of mechanisms, including 
exocytosis, host cell detachment, and host cell apoptosis [7]. 

Genomic information for cnidarians has been accumulating; the 
genomes of two non-symbiotic cnidarians, the anemone Nematostella 
vectensis (Anthozoa) [8], and a hydra, Hydra magnipapillata 
(Hydrozoa) [9], have been sequenced. The genome of a 
scleractinian coral, Acropora digitifera, was decoded using 
next-generation sequencing (NGS) technology [10]. The latter genome 
was estimated to comprise 420 Mbp and contains about 23,700 
predicted protein-coding genes. Recently, NGS-based transcriptome 
datasets have become available for several anthozoan 
cnidarians including several coral species: Acropora millepora 
[11,12], A. palmata [13] and Pocillopora damicornis [14], and a sea 
anemone Aiptasia pallida [15]. In addition, transcriptomes of two 
cultured strains of Symbiodinium have been reported [16] and the 
nuclear genome of S. minutum was decoded [17]. In order to better 
understand the molecular interactions, it is worthwhile to 
simultaneously capture the molecular states of all organisms 
comprising a coral holobiont. 

The massive stony corals of the genus Porites are common, 
important reef builders in the Indo-Pacific Ocean. More than 80 
named species and numerous unclassified forms have been 
identified [18]. In contrast to Acropora species, Porites corals 
generally transmit their symbionts directly from parents to 
offspring (vertical transmission) rather than acquiring them anew 
from the environment (horizontal transmission) in each generation 
[19]. Porites species have thicker tissues and appear more robust to 
thermal stress than other corals, such as Acropora, that have thinner 
tissues [20]. Thus Porites corals have been used for comparative 
analyses of stress responses [21]. Laboratory studies indicate that 
acidified seawater reduces calcification of Porites corals [22-24]. 
Geochemical tracers, such as oxygen isotope ratios, strontium-calcium 
ratios, and heavy metal concentrations in growth rings of 
the CaCO3 Porites skeleton have been used to monitor changes in 
sea surface temperature, salinity, and/or marine pollutants [25-28]. 
By such means, Porites taxa have been used to reconstruct past 
environmental changes so as to better understand tropical climate 
systems and to predict climate change [29,30]. Nevertheless, 
molecular information about Porites corals is still limited. To 
address this deficiency and to improve the utility of Porites as an 
environmental indicator, we sequenced the transcriptome of Porites 
australiensis and its symbiotic algae (Figure 1) using next-generation 
sequencing technology (Illumina HiSeq2000) and constructed a 
transcriptome dataset that contains a large proportion of 
sequences from the symbiotic alga. 




Figure 1. The scleractinian coral, Porites australiensis. (A) A P. 
australiensis colony used in this study. The diameter is approximately 
20 cm. (B) High-magnification photo of P. australiensis. The dotted circle 
indicates one polyp, and the tiny brown particles are symbiotic 
dinoflagellate algae (Symbiodinium). 
doi:10.1371/journal.pone.0085182.g001 



Materials and Methods 

RNA isolation, Transcriptome sequencing and assembly 

The coral sample used in this study was collected at Sesoko 
Island, Okinawa, Japan, under a permit from Okinawa Prefecture 
(Number: 20-69). A small Porites australiensis colony that had been 
maintained at the Sesoko research station of the University of the 
Ryukyus for five years (Figure 1) was used in this study. A 
fragment (2 cm diameter) from the coral was snap frozen in liquid 
nitrogen and pulverized with an iron mortar and pestle. Total 
RNA was isolated using an RNeasy RNA extraction kit 
(QIAGEN). Total RNA was then fragmented into about 200 bp 
lengths and an RNA-seq library was prepared using a TruSeq 
RNA Sample Prep Kit v2 (Illumina). cDNA normalization using a 
duplex-specific nuclease was also performed. The library was 
sequenced (100 bp paired-end reads) using the Illumina HiSeq 
2000 platform. Library preparation, cDNA normalization, and 
sequencing were outsourced to Hokkaido System Science Corporation, 
Japan. Low-quality bases (Phred quality value, QV < 20) 
were trimmed from the raw data and read pairs of at least 80 bp 
were retained using SolexaQA [31]. Possible PCR duplicates that 
originated during library preparation were removed with ConDeTri 
[32]. Contamination with TruSeq adapter sequences was 
removed with ea-utils (http://code.google.com/p/ea-utils). Subsequently, 
high-quality paired-end reads were assembled with 
Velvet/Oases software [33,34] using different k-mer sizes 
(35, 45, and 55) and merged at k-mer 27 using Oases [34]. For each 
word size, the longest isoform of each locus was selected. In 
addition, in order to recover low-expression genes and to produce 
longer assemblies, different coverage cut-off values (5, 8, 10 and 
12) were also applied. Contigs originating from different coverage 
cutoff values were concatenated and redundant sequences and 
transcript variants were removed by CDHIT-EST [35] using 
default parameters. Contigs longer than 200 bp were retained. 
Additionally, contigs containing translated Open Reading Frames 
(ORFs) in which 95% of the amino acids were identical were 
considered duplicates. In such cases, one of the duplicates was 
removed using CDHIT [35]. 

Separation of host or symbiont-originated sequences 

To distinguish whether contig sequences originated from host 
coral or symbiont Symbiodinium, nucleotide sequences of the 
assembled contigs were aligned to the genome sequences of both 
coral A. digitifera [10] and dinoflagellate S. minutum [17] by 
BLASTN, and contigs that showed nucleotide similarity to A. 
digitifera or S. minutum were identified. Contigs that aligned to 
both genomes were not annotated in this study. Contigs that 
aligned only to the A. digitifera genome were annotated as "Porites 
contigs", while those that aligned only to the S. minutum genome 
were annotated as "Symbiodinium contigs". Different e-value 
cutoffs (1e-1, 1e-2, 1e-3, 1e-4, 1e-5 and 1e-10) were 
examined and a cutoff that maximized the number of Porites or 
Symbiodinium contigs was adopted (Table S3). We also applied the 
same analyses for reported coral transcriptome data from A. 
millepora [12], A. palmata [13], and P. damicornis [14], in order to 
investigate proportions of Symbiodinium sequences in each dataset. 
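
The separation rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline actually used: it assumes the BLASTN results have already been reduced to two sets of contig IDs (those with a hit below the e-value cutoff in each genome), and the function name and contig IDs are hypothetical.

```python
def classify_contigs(all_contigs, host_hits, symbiont_hits):
    """Sort contig IDs by which reference genome(s) they hit.

    host_hits / symbiont_hits: IDs of contigs with a BLASTN hit below the
    chosen e-value cutoff in the coral / dinoflagellate genome (assumed to
    be precomputed; no BLAST is run here).
    """
    host_hits = set(host_hits)
    symbiont_hits = set(symbiont_hits)
    groups = {"host": [], "symbiont": [], "both": [], "none": []}
    for contig in all_contigs:
        in_host = contig in host_hits
        in_symbiont = contig in symbiont_hits
        if in_host and in_symbiont:
            groups["both"].append(contig)      # ambiguous: left unannotated
        elif in_host:
            groups["host"].append(contig)      # "Porites contigs"
        elif in_symbiont:
            groups["symbiont"].append(contig)  # "Symbiodinium contigs"
        else:
            groups["none"].append(contig)      # no hit in either genome
    return groups


# Toy example with hypothetical contig IDs
contigs = ["c1", "c2", "c3", "c4"]
groups = classify_contigs(contigs, host_hits={"c1", "c3"}, symbiont_hits={"c2", "c3"})
```

Repeating this classification for each candidate e-value cutoff and comparing the sizes of the "host" and "symbiont" groups would correspond to the cutoff selection described in the text.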

Annotation of transcriptome assembly 

Assembled transcriptome data were annotated as follows: 1) by 
BLAST homology searches against public protein databases: 
NCBI non-redundant protein sequences (NR) and Swiss-Prot [36], 
2) by assignment of Gene Ontology (GO) terms [37], 3) by 
mapping to pathways using the KEGG annotation service KAAS 
[38], and 4) by conserved protein domain searches with Pfam [39]. 
BLASTX homology searches were conducted against the Swiss-Prot 
and NCBI NR protein databases and an e-value cutoff of 
1e-5 was applied. For GO annotation, GO IDs were assigned 
based on the UniProt IDs of the best matches in the Swiss-Prot 
database [36]. Then Generic GO slim terms were assigned using 
map2slim.pl in go-perl (http://search.cpan.org/~cmungall/go- 
perl/). For KAAS pathway annotation and analysis, we used the 
bi-directional best-hit (BBH) method to query the set of organisms 
representative for 'eukaryotes' as suggested on the KAAS website, 
using default settings. To screen and identify conserved protein 
domains, we used the Pfam database (Pfam-A.hmm, release 24.0; 
http://pfam.sanger.ac.uk) [39]. Translated amino acid entries 
matching conserved domains were identified using HMMER 
searches (hmmer3) [40]. In order to avoid eliminating Symbiodi- 
nium- or coral-specific rapidly evolving domains, we used an e- 
value cutoff of le , as proposed by Kawashima et al. [41]. For 
assessing completeness of the Porites contigs, mutual best-hit BLAST 
analyses (TBLASTN, BLASTX, 1e-5) were performed against the 
sea anemone Nematostella vectensis proteome [8] in order to identify 
possible orthologs between Porites and Nematostella. We used 
Nematostella as a reference proteome in order to compare the number 
of orthologous pairs between A. digitifera [10] and Nematostella with 
that between Porites and Nematostella. The alignment coverage of each Porites 
contig across the full-length amino acid sequence of its possible 
Nematostella ortholog was investigated based on the BLAST 
search. For Symbiodinium contigs, the S. minutum proteome dataset [17] 
was used for identifying orthologous pairs. 
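
The mutual best-hit criterion can be illustrated with a short Python sketch. It assumes that hits from both search directions are available as (query, subject, bit score) tuples; the tuple layout, function names, and sequence IDs are hypothetical and are not taken from the original analysis.

```python
def best_hits(hits):
    """Return {query: subject}, keeping the highest-scoring subject per query.

    hits: iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(forward_hits, reverse_hits):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy example: contigs searched against a reference proteome and back again
forward = [("contigA", "nv1", 200.0), ("contigA", "nv2", 150.0),
           ("contigB", "nv2", 90.0), ("contigC", "nv1", 80.0)]
reverse = [("nv1", "contigA", 210.0), ("nv2", "contigA", 140.0),
           ("nv2", "contigB", 160.0)]
pairs = mutual_best_hits(forward, reverse)
```

In this toy data contigC is excluded because its best hit, nv1, points back to contigA rather than to contigC.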

Estimation of polymorphism in Porites 

For estimating the polymorphism rate of P. australiensis, high- 
quality, trimmed Illumina reads (see above) were re-mapped 
against possible host-originated contigs (Porites contigs) using the 
Burrows-Wheeler Aligner (BWA) [42] and SNPs (single nucleotide 
polymorphisms) in each contig were detected. Because BWA 
allows gaps in alignments, we set the maximum indel size to 5 bp 
(bwa aln -n 0.05; since the maximum read length is 100 bp) in 
order to detect small indels. SNPs and small indels were called 
using the SAMtools software package [43]. To ensure reliable and 
high-quality variant calling, SNPs and small indels were called only 
for positions with a minimum mapping quality (-Q) of 25, a 
minimum coverage (-d) of 10, and a maximum read depth (-D) of 200 
using the varFilter command in the SAMtools package. 
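
The three thresholds above amount to a simple per-site filter. The sketch below restates that logic in Python for illustration only; it is not the SAMtools implementation, and the `Variant` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    contig: str
    position: int
    mapping_quality: int  # mapping quality reported for the site
    depth: int            # number of reads covering the site


def passes_filter(variant, min_mapq=25, min_depth=10, max_depth=200):
    """Keep a site only if it meets the thresholds quoted in the text."""
    return (variant.mapping_quality >= min_mapq
            and min_depth <= variant.depth <= max_depth)


# Toy example with hypothetical sites
candidates = [
    Variant("contig1", 120, mapping_quality=40, depth=35),   # kept
    Variant("contig1", 480, mapping_quality=20, depth=35),   # low mapping quality
    Variant("contig2", 77, mapping_quality=40, depth=6),     # too shallow
    Variant("contig2", 910, mapping_quality=40, depth=950),  # too deep
]
kept = [v for v in candidates if passes_filter(v)]
```

The upper depth bound is the usual guard against collapsed repeats or very highly expressed transcripts, where apparent variants are more likely to be mapping artifacts.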

Identification of the symbiotic Symbiodinium type 

Illumina sequence data that originated from the Symbiodinium 
internal transcribed spacer 2 (ITS-2) region of the nuclear 
ribosomal array were identified using DNA sequences of 
Symbiodinium-specific ITS-2 primers used for genotyping: itsD 
and ITS-2rev2 [44]. Sequence reads containing these primer 
sequences were detected using BLAST (BLASTN) and sequences with 100% 
matches were retrieved. These were subsequently 
assembled into a contig using Phrap [45] with default parameters. 
Sequence variations within the assembled ITS-2 sequence were 
checked with BWA [42] and SAMtools [43] as mentioned above. 
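
The read-selection step can be approximated by an exact substring search on both strands. The following Python sketch uses this as a stand-in for the BLASTN search with 100% identity described above; the primer and read sequences are short made-up placeholders, not the real itsD or ITS-2rev2 sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reads_with_primer(reads, primers):
    """Return reads containing any primer exactly, on either strand."""
    targets = set()
    for primer in primers:
        primer = primer.upper()
        targets.add(primer)
        targets.add(reverse_complement(primer))
    return [read for read in reads
            if any(target in read.upper() for target in targets)]


# Toy example with a hypothetical 6-base "primer"
primers = ["ACGTTG"]
reads = ["TTACGTTGAA", "GGCAACGTCC", "TTTTTTTTTT"]
selected = reads_with_primer(reads, primers)
```

The selected reads would then be passed to an assembler, as described in the text.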

Results and Discussion 

Sequencing and assembly of the transcriptome 

Total RNA was isolated from a single colony of P. australiensis 
(Figure 1A), and then a normalized cDNA library was prepared 
and sequenced with the Illumina HiSeq2000 system. We obtained 
71 million paired-end sequences (14.3 Gbp, Table S1). Raw 
sequence data were submitted to the DDBJ Sequence Read 
Archive (DRA) under accession number DRA000906 (BioProject 
ID: PRJDB731). Subsequently, 45 million quality-trimmed read 
pairs (QV ≥ 20, both reads of a pair ≥ 80 bp, and PCR duplicates and 
TruSeq adapter sequences removed; approximately 9 Gbp, Table 
S1) were assembled using Velvet/Oases software (Figure 2, 
Table 1). Contigs of <200 bp were discarded because the RNA 
was fragmented into approximately 200-bp lengths during library 
preparation (see Materials and Methods) and these short contigs 
were most likely to be truncated. After exclusion of redundant 
sequences and transcript variants, we obtained 74,997 contigs 
without ambiguous sequence gaps, totaling 94.73 Mbp (Figure 2, 
Table 2). Although the size distribution was weighted toward 
smaller contigs, there were 31,199 contigs with lengths >1,000 bp, 
1,952 contigs >5,000 bp, and 198 contigs >10,000 bp 
(Figure 3A). The maximum contig length reached 
54,796 bp; the average was 1,263 bp, and the N50 size was 
2,037 bp (Table 2), indicating that the assembly is of comparable 
or better quality than extant anthozoan transcriptome assemblies 
using NGS platforms (Table S2). Assembled sequences have been 
submitted to the DDBJ/EMBL-Bank/GenBank Transcriptome 
Shotgun Assembly (TSA) Database under accession numbers 
FX435232-FX505330 and FX799345-FX804242. 
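
For readers unfamiliar with the N50 statistic quoted above, the following Python sketch shows one common way to compute it from a list of contig lengths. The toy lengths are invented; only the final check of the mean contig length uses figures reported in the text.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: 1,500 bases in total, so the halfway point is 750 bases
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)                       # 500 + 400 passes 750
toy_mean = sum(toy_lengths) / len(toy_lengths)

# Consistency check on the reported totals: 94.73 Mbp over 74,997 contigs
reported_mean = 94.73e6 / 74_997
```

Because N50 weights long contigs more heavily than the mean does, an N50 (2,037 bp) well above the mean (1,263 bp) is expected for a length distribution skewed toward short contigs.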

Dissecting the "holobiont" transcriptome 

Because total RNA was isolated from an adult P. australiensis 
colony (Figure 1A), the transcriptome assembly contains genes 
from both corals and their zooxanthellae symbionts (Figure 1B). 
When the GC content distribution of all contigs in the assembly 

[Figure 2 flowchart] 
1. RNA extraction from the Porites "holobiont". 
2. RNA sequencing by HiSeq2000: 100 bp paired-end sequencing; PCR duplicate removal; adapter trimming; quality trimming (QV > 20, at least 80 bp length on both reads). 
3. Transcriptome assembly: Velvet/Oases assembler; contigs longer than 200 bp retained; 74,997 contigs (mixture of host and symbiont transcriptomes). 
4. Separation of host- or symbiont-originated contigs: alignment to decoded genome sequences (BLASTN, 1e-4) of the scleractinian coral Acropora digitifera (420 Mbp) and the dinoflagellate Symbiodinium minutum (616 Mbp). 
5. Result: 26,658 Porites contigs and 26,627 Symbiodinium contigs. 

Figure 2. Flowchart of transcriptome sequencing, assembly, 
and separation of host or symbiont sequences. 
doi:10.1371/journal.pone.0085182.g002 

Figure 3. Analyses of the Porites australiensis "holobiont" transcriptome assembly. (A) Length distribution of transcripts in the 
transcriptome assembly. Yellow boxes indicate all contigs. Blue and brown lines indicate Porites contigs and Symbiodinium contigs, respectively. (B) 
Distribution of GC percentages of the assembled contigs. Black line: all contigs; blue dotted line: Porites contigs; brown dotted line: Symbiodinium contigs. 
(C) Comparison of proportions of Symbiodinium sequences among scleractinian coral transcriptome assemblies. Each sequence was aligned to 
the A. digitifera and S. minutum genome sequences by BLASTN (1e-5). Note that the high percentage of A. digitifera sequences shared with the A. 
millepora and A. palmata datasets occurs because these three corals are congeneric. 
doi:10.1371/journal.pone.0085182.g003 



Table 1. Summary of the Porites australiensis "holobiont" 
transcriptome assembly. 

                              All unique      Porites           Symbiodinium 
                              sequences       contigs           contigs 
Number of sequences           74,997          26,658 (35.5%)    26,627 (35.5%) 
Total basepairs (Mbp)         94.73           44.57             32.01 
Average length (bp)           1,263           1,672             1,202 
N50 size (bp)                 2,037           2,960             1,571 
Maximum contig length (bp)    54,796          44,581            12,901 

doi:10.1371/journal.pone.0085182.t001 



was analyzed, two clear peaks of approximately 40% and 53% 
were detected (Figure 3B), suggesting that these peaks possibly 
originated from the host (P. australiensis) and the symbiont 
(Symbiodinium sp.), respectively. The GC % of A. digitifera exons is 
about 40% [10] and that of a sea anemone, Nematostella, is similar 
[8], suggesting that a GC content of about 40% may be typical for 
anthozoan cnidarians. Therefore the 40% peak probably reflects 
P. australiensis genes. Considering that the mean GC % of 
exons in the Symbiodinium minutum genome is about 50% [17], the 
second peak might correspond to contigs from Symbiodinium. 
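
The per-contig GC percentage underlying this distribution is straightforward to compute. The Python sketch below is a minimal illustration; ignoring ambiguous bases such as N is an assumption made here, not a detail stated in the text, and the sequences are invented.

```python
def gc_percent(seq):
    """GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequences: one near the ~40% peak, one much more GC-rich
host_like = gc_percent("ATGCATGCAT")      # 4 of 10 bases are G or C
symbiont_like = gc_percent("GCGCATGCNN")  # 6 of 8 unambiguous bases are G or C
```

Binning these values over all contigs (for example in 1% steps) gives the kind of distribution in which the two peaks are visible.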

Next we tried to identify contigs in the assembly that originated 
with Porites or Symbiodinium. We found that nucleotide alignment to 
both the Acropora and Symbiodinium genomes by BLASTN 
effectively separated them. In this study, we adopted an e-value 
cutoff of 1e-4 (Table S3). In total, 30,446 sequences hit the A. digitifera 
genome and 30,415 sequences hit the S. minutum genome, with 3,788 
sequences hitting both (Table S3). We 
subsequently removed overlapping sequences, and annotated 
26,658 sequences as "Porites contigs" and 26,627 sequences as 
"Symbiodinium contigs". Annotated contigs comprised about 70% 
of all contigs and GC distributions of Porites and Symbiodinium 
contigs clearly matched the two peaks detected in the whole 
assembly (Figure 3B), indicating that we effectively separated 
Porites and Symbiodinium genes in the assembly. These annotated 
contigs were used for further analyses (Table 2, Figures 3, 4). 
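
The contig counts in this paragraph follow directly from the hit counts, as the short Python check below shows. All input numbers are taken from the text; only the variable names are introduced here.

```python
total_contigs = 74_997
host_hits = 30_446       # contigs hitting the A. digitifera genome
symbiont_hits = 30_415   # contigs hitting the S. minutum genome
overlap = 3_788          # contigs hitting both genomes

# Contigs hitting only one genome
porites_contigs = host_hits - overlap
symbiodinium_contigs = symbiont_hits - overlap

# Shares of the whole assembly, in percent
porites_share = 100 * porites_contigs / total_contigs
symbiodinium_share = 100 * symbiodinium_contigs / total_contigs
annotated_share = porites_share + symbiodinium_share
```

The two shares round to 35.5% each and sum to roughly 71%, matching the "about 70%" quoted above.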

In most transcriptome analyses of anthozoans [12,13,15], 
aposymbiotic materials were used for sequencing in order to 
minimize Symbiodinium contamination. Thus, we suspect that 
Symbiodinium sequences in those transcriptome assemblies have 
been under-detected. It has also been reported that RNA of adult 
P. damicornis colonies containing Symbiodinium was extracted, 
sequenced on a 454 platform and assembled, but that few 
Symbiodinium sequences had been isolated [14]. In contrast with 
previous studies, a large proportion of Symbiodinium sequences (at 
least 26,627 contigs, 35% of total assembled sequences) were present in the 
transcriptome assembly (Figure 3C). We confirmed that coral 
transcriptome data reported to date contained very few 
Symbiodinium-originated sequences (Figure 3C). Therefore this study 
represents the first simultaneous transcriptome profile of both 
partners of a coral holobiont. 

Transcriptome annotation 

For transcriptome annotation, BLAST homology searches 
against protein databases (NCBI NR and Swiss-Prot) were 
performed. Among 74,997 contigs, 33,935 (45.2%) have significant 
similarities (BLASTX, 1e-5) with NCBI NR and 25,893 
(34.5%) with Swiss-Prot database entries, respectively. As mentioned 
above, 30,446 sequences showed similarity to the A. digitifera 
genome and 30,415 sequences to the S. minutum genome, 
respectively (Table 2). Taken together, we were able to annotate 
60,516 (80.7%) sequences using public protein databases (NCBI 
NR and Swiss-Prot) and A. digitifera [10] and S. minutum [17] 
genomic data. The remaining 14,481 sequences (total 7.12 Mbp) 
could not be annotated by BLAST and their average length 
(491.7 bp) was much shorter than that of the total assembly. These 
may be fragmented regions of other contigs, assembly artifacts, or 
bacterial contaminants. Some show similarities to coral microsat- 
ellite sequences deposited in the NCBI nucleotide (NT) database 
(data not shown). 

Conserved protein domains in the contigs were identified with 
hmmer3 [40] using the Pfam domain database [39]. We identified 
6,081 Pfam domains in 30,002 contigs (Table 2). We were able to 
assign 5,001 KEGG ORTHOLOGY (KO) IDs to 7,452 contigs 
(Table 2) using the KAAS BBH method [38]. 

Based on a BLAST search against the Swiss-Prot database, we 
were able to assign at least one GO term to 11,048 Porites and 
9,915 Symbiodinium contigs. These 20,963 contigs were 
grouped into 101 categories in the second hierarchy of the Generic 
GO slim. Figures S1, S2, and S3 show GO annotation results for three 
GO sub-ontologies (cellular component, biological process, and 
molecular function). Since Porites corals are multicellular animals, 
cell death, neurological system process, cell proliferation, cell 
adhesion, and cell-cell signaling terms are more heavily represented 
than in Symbiodinium (2× more, Figure S1). DNA binding, 
enzyme binding, and transcription factor binding activity also 
seem to be higher in Porites (Figure S2). These may reflect the fact 
that 464 Porites contigs possess transcription factor domains (Table 
S4), whereas only 28 Symbiodinium contigs do (data not shown). On 
the other hand, transcripts of genes related to photosynthesis, 
transport, and generation of energy are more abundant among 
Symbiodinium contigs (Figure S1). 

Estimating transcriptome completeness 

About 44% of Porites contigs contain conserved protein domains 
and 4,528 unique Pfam domains were identified (Table 2). In the 
gene models of the A. digitifera genome, 3,684 unique Pfam 
domains were detected [10], indicating that the Porites contigs 
Figure 4. Amino acid metabolism in the Porites holobiont. (A) Schematic drawing of amino acid exchange between a coral polyp and symbiotic 
Symbiodinium. (B) Summary of amino acid biosynthetic capabilities of Porites (host) and Symbiodinium (symbiont) inferred from KEGG annotation. 
Numbers indicate how many amino acids can be produced by Porites or Symbiodinium, and numbers in brackets show numbers of essential amino 
acids. The blue circle indicates Porites and the brown circle indicates Symbiodinium. (C) Detailed annotations of amino acid biosynthetic pathways 
in the Porites holobiont. Arrows indicate enzymes involved in each amino acid biosynthetic pathway. Enzymes that are not annotated as Porites or 
Symbiodinium are shown with a black line, and those not detected in the transcriptome dataset are shown with dotted lines. Blue indicates enzymes 
identified in Porites contigs. Brown indicates enzymes detected in Symbiodinium contigs. Brown arrows boxed in blue lines indicate enzymes detected 
in both Porites and Symbiodinium. Amino acids that can be produced by Symbiodinium are boxed in brown and those that can be produced by Porites 
are boxed in blue. Amino acids boxed in dotted lines indicate that biosynthetic capabilities are not clear. 
doi:10.1371/journal.pone.0085182.g004 



probably contain a comparable variety of protein domains. In 
addition, we performed KEGG annotation and 3,633 unique KO 
IDs were identified among the Porites contigs (Table 2). We 
detected 3,684 unique KO IDs in the A. digitifera proteome using 
the KAAS BBH method. A comparable number of KO IDs might 
indicate that, although RNA-seq was performed using a single 
RNA library, the majority of P. australiensis genes were successfully 
recovered in the transcriptome assembly. However, it is 
possible that those numbers in Porites contigs are overestimated due 
to incomplete clustering of transcript variants in the assembly. 

An increasing number of studies have recently identified genes 
that encode transcription factors and signal transduction molecules 
in corals and have determined their roles in cnidarian evolution 
[46-50]. Therefore, for further assessing the completeness of the 
Porites transcriptome, the repertoire of transcription factors and 
signal transduction molecules was compared with those of other 
cnidarians whose genomes have been decoded (Tables S4, S5). We 
detected comparable numbers of transcription factors and 
signaling molecules among the Porites contigs to what we previously 
found in the A. digitifera genome [10], e.g. Porites: 89 homeobox, 24 
HMG box and 15 wnt genes; A. digitifera: 97 homeobox, 26 HMG 
box and 15 wnt genes (Tables S4, S5). Next we identified 
orthologous gene pairs between Porites contigs and Nematostella by 
mutual best-hit BLAST analysis; 7,825 pairs were detected. The 
same comparison between the A. digitifera and Nematostella genomes yielded 
8,416 orthologous pairs, suggesting that Porites contigs represent 
the majority of the gene repertoire of Porites australiensis. Of the 
7,825 orthologous genes, 6,324 have more than 80% alignment 
coverage across Nematostella orthologs, suggesting that most of the 
Porites contigs recover full-length open reading frames. 

On the other hand, 3,569 Pfam domains and 1,352 KO IDs 
were detected among Symbiodinium contigs (Table 2), while 4,035 
Pfam domains and 2,622 KO IDs were detected in the S. minutum 
gene models [17]. We also identified 28 Symbiodinium 
contigs containing transcription factor domains (data not shown). 
It has been reported that in transcriptome assemblies from two 
cultured strains of Symbiodinium, 156 and 87 sequences, respectively, 
contained at least one known transcription factor domain [16]. 
The S. minutum genome was found to contain 32 transcription 
factor genes [17]. We detected 15,914 orthologous pairs between 
the Symbiodinium contigs and the S. minutum genome. In the S. minutum 
genome, 23,487 orthologous groups were identified by comparison 
with 150 genomes [17]. This difference might reflect that not all genes were 
expressed at the sampling point. 

Polymorphism rate in Porites 

Since total RNA was extracted from a single Porites colony, it is 
possible to estimate the polymorphism rate of P. australiensis within 
an individual. High quality Illumina reads were re-mapped against 
the reference Porites contigs using BWA software [42], and high 
quality SNPs and small indels (up to 5 bp) within the Porites colony 
were identified using SAMtools [43]. Among the 44,569,031 
nucleotides of the 26,658 Porites contigs, 425,728 SNPs and 11,788 
small indels (total 20,213 bp) were detected. Taken together, the 
polymorphism rate between haplotypes, including small indels (up 
to 5 bp) and SNPs in the exons of Porites transcripts, is estimated 
to be 1.0%, which is slightly higher than that observed in other 
cnidarian genomes (A. digitifera: 0.4%, Shinzato and Mungpakdee, 
unpublished data; Nematostella: 0.65% [8]; Hydra: 0.69% [9]). 
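
The 1.0% estimate can be reproduced from the counts given above. The Python sketch below assumes that the rate is the number of SNP positions plus the total number of indel bases, divided by the total length of the Porites contigs; the text does not spell out the formula, so this is one plausible reading rather than the authors' stated method.

```python
snps = 425_728             # SNP positions detected
indel_bases = 20_213       # total bases in the 11,788 small indels
total_bases = 44_569_031   # summed length of the 26,658 Porites contigs

# Assumed definition: polymorphic bases / total bases, in percent
snp_rate = 100 * snps / total_bases
polymorphism_rate = 100 * (snps + indel_bases) / total_bases
```

Under this assumption SNPs alone account for a little over 0.95%, and including indel bases brings the figure to 1.0% after rounding.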

Symbiont type 

The genus Symbiodinium displays tremendous taxonomic diver- 
sity with nine divergent lineages. Clades A-I have been described 
in Symbiodinium based on nuclear ribosomal DNA (rDNA) and 
chloroplast 23S rDNA [51]. Each clade contains multiple genetic 
types, often resolved using the internal transcribed spacer 2 (ITS-2) 
regions [44,52]. We failed to detect assembled sequences of 
nuclear rDNA or chloroplast 23S rDNA in our dataset. 
Accordingly, the ITS-2 region-containing sequence reads were 
identified and assembled, and then we managed to reconstruct an 
ITS-2 region (341 bp) of the symbiotic Symbiodinium. BLAST 
homology searches against the NCBI NT database revealed that 
this sequence shows high similarity (340/341 bp identity) with 
clade C Symbiodinium sp. C15 (GenBank: JN558044). We also 
checked the sequence variation within the ITS-2 region by BWA 
mapping and SAMtools, but no variation was detected (data not 
shown), indicating that the C15 type, a type common in Porites 
species across the Pacific [53], is the major Symbiodinium type in the 
P. australiensis colony used in this study. 

Amino acid biosynthesis pathways in Porites holobiont 

Bacteria, plants, and many fungi are able to synthesize all of the 
20 amino acids commonly found in proteins, whereas all animals 
studied to date either lack the ability to synthesize one or more of 
these amino acids, or else they are unable to synthesize quantities 
sufficient to meet their metabolic needs. These amino acids are 
termed "essential" and must be obtained from the diet. For 
vertebrates, eight or more amino acids are essential; threonine, 
valine, methionine, leucine, isoleucine, phenylalanine, lysine, and 
tryptophan are required by all vertebrates, while arginine and/or 
histidine are also essential in some cases [54]. Although amino acid 
biosynthesis occurs via conserved pathways and is relatively well- 
characterized [55], amino acid exchange in cnidarian photoauto- 
trophic symbioses is poorly understood. The issue of amino acid 
requirements is complicated by the presence of symbiotic algae 
[56,57]. For example, examination of the amino acid biosynthetic 
capacity of the sea anemone Aiptasia pulchella - a facultative host of 
Symbiodinium sp. - suggested that seven amino acids (histidine, 
isoleucine, leucine, lysine, phenylalanine, tyrosine, and valine) are 
synthesized by the symbiotic algae and translocated to the sea 
anemone, and that methionine and threonine are likely to be 
synthesized by A. pulchella itself [57]. A genome-wide survey of 
amino acid biosynthetic pathway components in A. digitifera 
revealed that Acropora corals may be able to synthesize ten non-essential 
amino acids, but not cysteine [10]; however, amino acid 
metabolism in coral holobionts is still unclear (Figure 4A). We 
reconstructed the amino acid biosynthetic pathways in the Porites 
holobiont based on KEGG IDs (Figure 4B, C). In this study we 
took amino acid biosynthetic pathways missing one or two 
enzymes into consideration. Most of the enzymes involved in amino 
acid biosynthetic pathways were detected in the transcriptomes of 
Porites and Symbiodinium, while four enzymes could not be detected in 
the lysine anabolic pathway (Figure 3C). 
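
The tolerance rule described above (considering pathways that lack one or two enzymes) can be illustrated with a minimal sketch. This is not the procedure used in this study: the function name, the `max_missing` parameter, and the enzyme identifiers are hypothetical placeholders, not real KEGG IDs or results from the dataset.

```python
# Minimal sketch of a missing-enzyme tolerance rule.
# All enzyme identifiers below are hypothetical placeholders, not real
# KEGG IDs and not results from this study.

def assess_pathway(required, detected, max_missing=2):
    """Classify one pathway by how many of its enzymes were detected.

    required    -- list of enzyme IDs that make up the pathway
    detected    -- set of enzyme IDs annotated in the contigs
    max_missing -- largest number of absent enzymes still tolerated
    """
    missing = [enzyme for enzyme in required if enzyme not in detected]
    if not missing:
        status = "complete"
    elif len(missing) <= max_missing:
        status = "partial"      # still taken into consideration
    else:
        status = "unsupported"  # too many gaps to consider
    return status, missing


# Hypothetical five-step pathway with one undetected enzyme.
example_pathway = ["ENZ1", "ENZ2", "ENZ3", "ENZ4", "ENZ5"]
example_detected = {"ENZ1", "ENZ2", "ENZ4", "ENZ5"}
status, missing = assess_pathway(example_pathway, example_detected)
print(status, missing)
```

Returning the list of missing enzymes alongside the status keeps the gaps visible, so that they can be checked individually rather than hidden behind a single label.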

Interestingly, many enzymes involved in essential amino acid 
biosynthesis were detected only in Symbiodinium contigs (Figure 3B, 
Figure 3C, left side). Overrepresentation of "transport" GO terms 
in Symbiodinium contigs (Figure S1) suggests that Symbiodinium 
may transport essential amino acids to host cells. In contrast, enzymes 
for non-essential amino acid pathways were detected in both 
Symbiodinium and Porites (Figure 3B, Figure 3C, right side). 
Glutamic acid, glutamine, aspartic acid, and alanine are probably 
produced in both Porites and Symbiodinium. According to our 
prediction, asparagine is synthesized in Symbiodinium, while cysteine 
is synthesized in Porites (Figure 3). We have shown previously that 
Acropora corals lack an essential enzyme for cysteine biosynthesis, 
cystathionine β-synthase [10]; however, Porites seems to possess it 
(Figure 3C), indicating that it does not depend upon Symbiodinium 
for cysteine biosynthesis. This independence might account for its greater 
resilience to environmental stresses. An interesting example is 
the methionine biosynthesis pathway. Half of the enzymes were found 
in Porites contigs and the other half in Symbiodinium contigs, suggesting that 
methionine might be produced by intimate cooperation between 
host and symbiont in coral holobionts. While transcriptome 
sequencing was based upon a single RNA-seq library in this study, 
we were able to recover a surprisingly large gene repertoire for 
amino acid biosynthesis. Our data appear to provide the first 
direct molecular evidence of complementarity and syntrophy 
between coral hosts and their symbionts in amino acid metabo- 
lism. 
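
The notion of complementarity used above can be made concrete with a small sketch: a pathway is treated as complementary when neither partner alone expresses every enzyme, but the two together cover all steps. The function names, enzyme identifiers, and detection sets below are hypothetical and serve only as an illustration; they do not reproduce the annotation in this study.

```python
# Hypothetical illustration; enzyme IDs and detection sets are placeholders.

def classify_origin(pathway, host_ids, symbiont_ids):
    """Label each enzyme of a pathway by the partner(s) it was detected in."""
    labels = {}
    for enzyme in pathway:
        in_host = enzyme in host_ids
        in_symbiont = enzyme in symbiont_ids
        if in_host and in_symbiont:
            labels[enzyme] = "both"
        elif in_host:
            labels[enzyme] = "host only"
        elif in_symbiont:
            labels[enzyme] = "symbiont only"
        else:
            labels[enzyme] = "undetected"
    return labels


def is_complementary(labels):
    """True when all enzymes are detected but neither partner has them all."""
    origins = set(labels.values())
    return (
        "undetected" not in origins
        and "host only" in origins
        and "symbiont only" in origins
    )


# Hypothetical four-step pathway split evenly between host and symbiont.
pathway = ["ENZ1", "ENZ2", "ENZ3", "ENZ4"]
host_ids = {"ENZ1", "ENZ2"}
symbiont_ids = {"ENZ3", "ENZ4"}
labels = classify_origin(pathway, host_ids, symbiont_ids)
print(labels, is_complementary(labels))
```

Because detection in a transcriptome reflects expression at the time of sampling, an "undetected" label in such a scheme does not show that the gene is absent from the genome.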

Conclusions 

We sequenced the transcriptome of a Porites australiensis holobiont and demonstrated 
that decoded genomic data of an Acropora coral [10] and 
Symbiodinium minutum [17] greatly assisted the characterization of 
assembled contigs from a mixture of RNAs from different 
organisms. In this study we were able to distinguish about 70% of 
contigs as host or symbiont. The assembled sequences contain a 
wide variety of genetic information from both the coral and its 
symbiont, Symbiodinium, including genes for most enzymes in all 
amino acid biosynthetic pathways. When genome sequences of P. 
australiensis and its symbiotic Symbiodinium are available, the 
annotation could be further improved. The Porites holobiont 
transcriptome dataset allows us to utilize Porites australiensis, an 
abundant coral species from the Indo-Pacific, to reveal molecular 
mechanisms of coral symbiosis and coral stress responses. Our 
approach provides an opportunity to simultaneously analyze coral-symbiont 
interactions at the transcriptomic level. Furthermore, 
analyzing the molecular bases of Porites calcification may enable 
researchers to improve the accuracy of future climate change 
predictions. 

Supporting Information 

Figure S1 Analysis of GO term enrichment of the 
"biological process" category for Porites and Symbiodinium 
contigs. GO terms containing at least 100 sequences are 
shown. The Y-axis represents proportions of contigs in each 
category of GO-assigned contigs (Porites: 9806, Symbiodinium: 9147). 
Blue bars indicate Porites contigs. Red bars indicate Symbiodinium 
contigs. 
(PDF) 

Figure S2 Analysis of GO term enrichment of the 
"cellular component" category for Porites and Symbiodinium 
contigs. GO terms containing at least 100 sequences are 
shown. The Y-axis represents proportions of contigs in each 
category of GO-assigned contigs (Porites: 9806, Symbiodinium: 9147). 
Blue bars indicate Porites contigs. Red bars indicate Symbiodinium 
contigs. 
(PDF) 

Figure S3 Analysis of GO term enrichment of the 
"molecular function" category for Porites and Symbiodinium 
contigs. GO terms containing at least 100 sequences are 
shown. The Y-axis represents proportions of contigs in each 
category of GO-assigned contigs (Porites: 9806, Symbiodinium: 9147). 
Blue bars indicate Porites contigs. Red bars indicate Symbiodinium 
contigs. 
(PDF) 

Table S1 Summary of the sequencing data. 

(PDF) 

Table S2 Summary of published anthozoan transcriptome 
assemblies. References: 1. Moya A, Huisman L, Ball EE, 
Hayward DC, Grasso LC, Chua CM, Woo HN, Gattuso JP, Foret 
S, Miller DJ: Whole transcriptome analysis of the coral Acropora 
millepora reveals complex responses to CO2-driven acidification 
during the initiation of calcification. Mol Ecol 2012, 21:2440-2454. 
2. Polato NR, Vera JC, Baums IB: Gene discovery in the 
threatened elkhorn coral: 454 sequencing of the Acropora palmata 
transcriptome. PLoS One 2011, 6:e28634. 3. Traylor-Knowles N, 
Granger BR, Lubinski TJ, Parikh JR, Garamszegi S, Xia Y, Marto 
JA, Kaufman L, Finnerty JR: Production of a reference 
transcriptome and transcriptomic database (PocilloporaBase) for 
the cauliflower coral, Pocillopora damicornis. BMC Genomics 2011, 
12:585. 4. Lehnert EM, Burriesci MS, Pringle JR: Developing the 
anemone Aiptasia as a tractable model for cnidarian-dinoflagellate 
symbiosis: the transcriptome of aposymbiotic A. pallida. BMC 
Genomics 2012, 13:271. 
(PDF) 

Table S3 Results of nucleotide sequence alignment of 
the assembled sequences with different e-value settings 
of BLASTN against Acropora digitifera and Symbiodinium 
minutum genome sequences. We selected an e-value cutoff 
of 1e-4 in this study because it yielded the largest number of 
Porites contigs. 
(PDF) 

Table S4 Comparison of the number of genes with 
transcription factor-related domains of Porites australiensis, 
Acropora digitifera, Nematostella vectensis and 
Hydra magnipapillata. 

(PDF) 

Table S5 Comparison of the number of genes with 
signaling molecule-related domains of Porites australiensis, 
Acropora digitifera, Nematostella vectensis and 
Hydra magnipapillata. 

(PDF) 

File S1 List of Porites contigs (tab-separated text; contig 
names and NCBI accessions). 

(ZIP) 

File S2 List of Symbiodinium contigs (tab-separated 
text; contig names and NCBI accessions). 

(ZIP) 

File S3 Annotation files for all contigs (tab-separated 
text; contig name, NCBI accession, Swiss-Prot blast 
result, Pfam ID, KEGG ID). 

(ZIP) 